New inference strategies for solving Markov Decision Processes using reversible jump MCMC

Authors

  • Matthias Hoffman
  • Hendrik Kück
  • Nando de Freitas
  • Arnaud Doucet
Abstract

In this paper we build on previous work that uses inference techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and sampled trajectories in order to sample more freely. Finally, we show how to incorporate these techniques in a principled manner to obtain estimates of the optimal policy.

Related resources

On Solving General State-Space Sequential Decision Problems using Inference Algorithms

A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. For example, Toussaint et al. (2006) use EM with optimal smo...

MCMC for hidden continuous-time

Hidden Markov models have proved to be a very flexible class of models, with many and diverse applications. Recently Markov chain Monte Carlo (MCMC) techniques have provided powerful computational tools to make inferences about the parameters of hidden Markov models, and about the unobserved Markov chain, when the chain is defined in discrete time. We present a general algorithm, based on reversib...

Monte Carlo Methods and Bayesian Computation: MCMC

Markov chain Monte Carlo (MCMC) methods use computer simulation of Markov chains in the parameter space. The Markov chains are defined in such a way that the posterior distribution in the given statistical inference problem is the asymptotic distribution. This allows ergodic averages to be used to approximate the desired posterior expectations. Several standard approaches to define such Markov chai...

A comparison of reversible jump MCMC algorithms for DNA sequence segmentation using hidden Markov models

This paper describes a Bayesian approach to determining the number of hidden states in a hidden Markov model (HMM) via reversible jump Markov chain Monte Carlo (MCMC) methods. Acceptance rates for these algorithms can be quite low, resulting in slow exploration of the posterior distribution. We consider a variety of reversible jump strategies which allow inferences to be made in discretely obse...

Bayesian Inference on Principal Component Analysis Using Reversible Jump Markov Chain Monte Carlo

Based on the probabilistic reformulation of principal component analysis (PCA), we consider the problem of determining the number of principal components as a model selection problem. We present a hierarchical model for probabilistic PCA and construct a Bayesian inference method for this model using reversible jump Markov chain Monte Carlo (MCMC). By regarding each principal component as a poin...


Publication date: 2009